Overview

Fetches the NSE surveillance lists, ASM (Additional Surveillance Measure) and GSM (Graded Surveillance Measure), from Google Sheets (primary), with two fallbacks: Dhan’s Next.js JSON API and web scraping. These lists identify stocks placed under regulatory surveillance because of price volatility or other concerns.
Source: fetch_surveillance_lists.py
Phase: Phase 2 (Enrichment)
Output: nse_asm_list.json, nse_gsm_list.json

Data Sources

The script implements a 3-tier fallback strategy:

Primary Source: Google Sheets Gviz API

GET https://docs.google.com/spreadsheets/d/1zqhM3geRNW_ZzEx62y0W5U2ZlaXxG-NDn0V8sJk5TQ4/gviz/tq?tqx=out:json&gid={gid}
gid (string, required): Google Sheet tab ID:
  • ASM List: 290894275
  • GSM List: 1525483995

Secondary Source: Dhan Next.js JSON API

GET https://dhan.co/_next/data/{buildId}/{data_key}.json
buildId (string, required): Dynamically fetched from the page source of https://dhan.co/all-indices/
data_key (string, required):
  • ASM: nse-asm-list
  • GSM: nse-gsm-list

Tertiary Source: Web Scraping

Fallback scraping from:
  • https://dhan.co/nse-asm-list/
  • https://dhan.co/nse-gsm-list/
Extracts the data from the JSON block inside <script id="__NEXT_DATA__">.
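
The extraction step above can be sketched with the standard library alone. The script lists BeautifulSoup as its HTML parser; the regex here is a swapped-in stand-in so the example has no third-party dependency. extract_next_data and sample_html are hypothetical names, and the sample markup is made up for illustration.

```python
import json
import re

# Matches the contents of <script id="__NEXT_DATA__" ...> ... </script>.
NEXT_DATA_RE = re.compile(
    r'<script[^>]*\bid="__NEXT_DATA__"[^>]*>(.*?)</script>',
    re.DOTALL,
)


def extract_next_data(html):
    """Return the decoded __NEXT_DATA__ JSON, or None if absent or invalid."""
    match = NEXT_DATA_RE.search(html)
    if not match:
        return None
    try:
        return json.loads(match.group(1))
    except json.JSONDecodeError:
        return None


# Made-up page fragment, not a real Dhan response.
sample_html = (
    '<html><body><script id="__NEXT_DATA__" type="application/json">'
    '{"buildId": "abc123", "props": {"pageProps": {}}}'
    "</script></body></html>"
)

next_data = extract_next_data(sample_html)
```

The decoded object can then be searched for the stock list in the same way as the Next.js JSON response.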

Configuration

lists_config = {
    "nse_asm_list.json": {
        "gid": "290894275",
        "web_url": "https://dhan.co/nse-asm-list/",
        "data_key": "nse-asm-list"
    },
    "nse_gsm_list.json": {
        "gid": "1525483995",
        "web_url": "https://dhan.co/nse-gsm-list/",
        "data_key": "nse-gsm-list"
    }
}
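
As an illustration of how one config entry maps onto the three sources, the sketch below builds the URLs in fallback order. source_urls and the tier labels are hypothetical helpers, not names taken from the script, and "BUILD123" is a placeholder for a real buildId.

```python
from typing import List, Tuple

# Spreadsheet ID taken from the Gviz endpoint documented above.
SHEET_ID = "1zqhM3geRNW_ZzEx62y0W5U2ZlaXxG-NDn0V8sJk5TQ4"

lists_config = {
    "nse_asm_list.json": {
        "gid": "290894275",
        "web_url": "https://dhan.co/nse-asm-list/",
        "data_key": "nse-asm-list",
    },
    "nse_gsm_list.json": {
        "gid": "1525483995",
        "web_url": "https://dhan.co/nse-gsm-list/",
        "data_key": "nse-gsm-list",
    },
}


def source_urls(output_file: str, build_id: str) -> List[Tuple[str, str]]:
    """Return (tier, url) pairs for one output file, in fallback order."""
    cfg = lists_config[output_file]
    return [
        ("gviz", "https://docs.google.com/spreadsheets/d/"
                 f"{SHEET_ID}/gviz/tq?tqx=out:json&gid={cfg['gid']}"),
        ("next_json", f"https://dhan.co/_next/data/{build_id}/{cfg['data_key']}.json"),
        ("scrape", cfg["web_url"]),
    ]


asm_urls = source_urls("nse_asm_list.json", "BUILD123")
```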

Function Signatures

get_build_id()

def get_build_id():
    """
    Dynamically fetch the Next.js buildId.
    
    Returns:
        str: Build ID string, or None if extraction fails
    """
Extracts buildId from page source using regex:
match = re.search(r'"buildId":"([^"]+)"', response.text)
return match.group(1) if match else None
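
To see how this pattern behaves without making a network call, the matching step can be pulled into a pure function and run against sample strings. extract_build_id and both sample strings are hypothetical; the regex is the one shown above.

```python
import re
from typing import Optional

BUILD_ID_RE = re.compile(r'"buildId":"([^"]+)"')


def extract_build_id(page_source: str) -> Optional[str]:
    """Return the buildId embedded in the page source, or None."""
    match = BUILD_ID_RE.search(page_source)
    return match.group(1) if match else None


# Compact JSON, with no space after the colon.
compact = '<script>{"props":{},"buildId":"AbC-123_x","page":"/"}</script>'
# The same key written with a space after the colon.
spaced = '{"buildId": "AbC-123_x"}'

found = extract_build_id(compact)
not_found = extract_build_id(spaced)
```

The pattern is whitespace-sensitive: it matches only when the key and value are serialized with no space after the colon, so callers should be ready for a None result and fall through to the next source.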

fetch_surveillance_lists()

def fetch_surveillance_lists():
    """
    Main function that fetches both ASM and GSM lists using 3-tier fallback.
    Writes nse_asm_list.json and nse_gsm_list.json to current directory.
    """

Output Structure

Each output file is a JSON array of objects with the following fields:
Symbol (string): Stock trading symbol (e.g., “YESBANK”)
Name (string): Company display name
ISIN (string): ISIN code of the security
Stage (string): Surveillance stage (e.g., “LTASM”, “STASM”, “GSM”)

Example Output (ASM List)

[
  {
    "Symbol": "YESBANK",
    "Name": "Yes Bank Limited",
    "ISIN": "INE528G01035",
    "Stage": "LTASM"
  },
  {
    "Symbol": "SUZLON",
    "Name": "Suzlon Energy Limited",
    "ISIN": "INE040H01021",
    "Stage": "STASM"
  }
]
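
A consumer may want to sanity-check records of this shape before using them. The sketch below is a hypothetical check, not something the script is documented to do: it verifies that all four fields are present and that the ISIN has the standard 12-character layout (two letters, nine alphanumerics, one check digit). It does not verify the check digit itself.

```python
import re
from typing import Dict, List

REQUIRED_KEYS = ("Symbol", "Name", "ISIN", "Stage")
# Structural ISIN check only; the check digit is not validated.
ISIN_RE = re.compile(r"^[A-Z]{2}[A-Z0-9]{9}[0-9]$")


def validate_records(records: List[Dict[str, str]]) -> List[str]:
    """Return human-readable problems; an empty list means all records pass."""
    problems = []
    for index, record in enumerate(records):
        missing = [key for key in REQUIRED_KEYS if not record.get(key)]
        if missing:
            problems.append(f"record {index}: missing {', '.join(missing)}")
        isin = record.get("ISIN")
        if isin and not ISIN_RE.match(isin):
            problems.append(f"record {index}: malformed ISIN {isin!r}")
    return problems


sample = [
    {"Symbol": "YESBANK", "Name": "Yes Bank Limited",
     "ISIN": "INE528G01035", "Stage": "LTASM"},
    {"Symbol": "SUZLON", "Name": "Suzlon Energy Limited",
     "ISIN": "INE040H01021", "Stage": "STASM"},
]

sample_problems = validate_records(sample)
```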

Data Extraction Logic

From Gviz API Response

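import json
import re

# response is the result of the Gviz GET request shown above
text = response.text
# Gviz wraps its JSON payload in a setResponse(...) call; capture the argument
match = re.search(r'setResponse\((.*)\);', text)
if match:
    data = json.loads(match.group(1))
    rows = data.get('table', {}).get('rows', [])

    for row in rows:
        c = row.get('c', [])  # cells of this row; an empty cell may be null
        if len(c) >= 5:
            # Cells 1-4 hold the four output fields; cell 0 is not used
            symbol = c[1].get('v') if c[1] else None
            name = c[2].get('v') if c[2] else None
            isin = c[3].get('v') if c[3] else None
            stage = c[4].get('v') if c[4] else None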
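
The excerpt above stops once the four values are extracted. A minimal end-to-end sketch, assuming the same column layout and the header-row rule described under Error Handling, could collect them into output records as follows. parse_gviz and the sample payload are hypothetical, and the sample only mimics the shape of a Gviz response.

```python
import json
import re


def parse_gviz(text):
    """Turn a Gviz response body into a list of output records."""
    match = re.search(r'setResponse\((.*)\);', text)
    if not match:
        return []
    data = json.loads(match.group(1))
    records = []
    for row in data.get("table", {}).get("rows", []):
        c = row.get("c", [])
        if len(c) < 5:
            continue  # row too short to hold all four fields
        # A null cell decodes to None, so fall back to an empty dict.
        symbol, name, isin, stage = ((cell or {}).get("v") for cell in c[1:5])
        if not symbol or symbol == "Symbol":
            continue  # skip blank rows and the header row
        records.append(
            {"Symbol": symbol, "Name": name, "ISIN": isin, "Stage": stage}
        )
    return records


# Made-up payload: header row, one valid row, one null-symbol row, one short row.
sample_table = {"table": {"rows": [
    {"c": [{"v": "Sr"}, {"v": "Symbol"}, {"v": "Name"}, {"v": "ISIN"}, {"v": "Stage"}]},
    {"c": [{"v": 1}, {"v": "YESBANK"}, {"v": "Yes Bank Limited"},
           {"v": "INE528G01035"}, {"v": "LTASM"}]},
    {"c": [{"v": 2}, None, {"v": "x"}, {"v": "y"}, {"v": "z"}]},
    {"c": [{"v": 3}]},
]}}
sample_text = (
    "/*O_o*/\ngoogle.visualization.Query.setResponse("
    + json.dumps(sample_table)
    + ");"
)

records = parse_gviz(sample_text)
```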

From Next.js JSON

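def find_list(obj):
    """Recursively search decoded JSON for the list of stock records."""
    # A match is a list of more than 3 items whose first item is a dict
    # containing a 'sym' or 'Sym' key
    if isinstance(obj, list) and len(obj) > 3:
        if isinstance(obj[0], dict) and ('sym' in obj[0] or 'Sym' in obj[0]):
            return obj
    # Otherwise descend into dict values and return the first match found
    if isinstance(obj, dict):
        for v in obj.values():
            res = find_list(v)
            if res:
                return res
    return None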
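
To illustrate the search, the function can be run against a small nested payload. The key names pageProps, meta, tags, and listData are made up for this example and are not taken from a real Dhan response; find_list is repeated so the block runs on its own.

```python
def find_list(obj):
    """Recursively search decoded JSON for the list of stock records."""
    if isinstance(obj, list) and len(obj) > 3:
        if isinstance(obj[0], dict) and ("sym" in obj[0] or "Sym" in obj[0]):
            return obj
    if isinstance(obj, dict):
        for v in obj.values():
            res = find_list(v)
            if res:
                return res
    return None


# Hypothetical payload: the stock list sits two levels down, next to
# a list of plain strings that must not be mistaken for it.
payload = {
    "pageProps": {
        "meta": {"title": "NSE ASM List"},
        "tags": ["a", "b", "c", "d"],
        "listData": [{"sym": "AAA"}, {"sym": "BBB"}, {"sym": "CCC"}, {"sym": "DDD"}],
    }
}

# Only three records, so the len(obj) > 3 threshold rejects this list.
short_payload = {"listData": [{"sym": "AAA"}, {"sym": "BBB"}, {"sym": "CCC"}]}

found = find_list(payload)
missed = find_list(short_payload)
```

Two properties follow from the code as written: a list of three or fewer records is never returned, and the search descends only through dict values, so a matching list nested inside another list is not found.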

Dependencies

  • requests: HTTP client (third-party)
  • json: JSON parsing (standard library)
  • re: regex for buildId and Gviz extraction (standard library)
  • BeautifulSoup: HTML parsing for fallback scraping (third-party, installed as beautifulsoup4)

Error Handling

  • Graceful fallback: if Gviz fails, tries Next.js JSON; if that fails, scrapes webpage
  • Returns empty list on total failure with error message
  • 10-second timeout for all HTTP requests
  • Skips header row (where Symbol == "Symbol")
if symbol == "Symbol" or not symbol:
    continue
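
The graceful fallback described above can be sketched as a loop over source functions that stops at the first non-empty result. This is an illustration under assumptions, not the script's code: fetch_with_fallback, the tier labels, and the stub fetchers are all hypothetical, and the stubs stand in for the real HTTP calls.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]
Fetcher = Callable[[], List[Record]]


def fetch_with_fallback(tiers: List[Tuple[str, Fetcher]]) -> Tuple[str, List[Record]]:
    """Try each source in order; return (label, records) for the first success."""
    for label, fetch in tiers:
        try:
            records = fetch()
        except Exception as exc:  # network error, bad JSON, and so on
            print(f"{label} failed: {exc}")
            continue
        if records:
            return label, records
        print(f"{label} returned no records")
    print("All sources failed")
    return "none", []


# Stub fetchers standing in for the Gviz, Next.js JSON, and scraping tiers.
def failing_source() -> List[Record]:
    raise RuntimeError("simulated timeout")


def empty_source() -> List[Record]:
    return []


def working_source() -> List[Record]:
    return [{"Symbol": "YESBANK", "Name": "Yes Bank Limited",
             "ISIN": "INE528G01035", "Stage": "LTASM"}]


used_tier, items = fetch_with_fallback([
    ("gviz", failing_source),
    ("next_json", empty_source),
    ("scrape", working_source),
])
```

Treating an empty result the same as an exception means a source that responds but yields no rows still triggers the next tier.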

Usage Example

python3 fetch_surveillance_lists.py
Expected output (item counts will vary as the lists change):
Primary Fetch: Gviz API (Spreadsheet) for nse_asm_list.json...
Successfully saved 142 items via Gviz.
Primary Fetch: Gviz API (Spreadsheet) for nse_gsm_list.json...
Successfully saved 28 items via Gviz.

Integration

This script is part of Phase 2 (Enrichment) in the EDL Pipeline. The output files are consumed by:
  • add_corporate_events.py: adds “★: LTASM / STASM” event markers to stocks
Run via master pipeline:
python3 run_full_pipeline.py